Comments about the article in Nature: The language machines.

Following is a discussion of the article in Nature Vol 591, 4 March 2021, by Matthew Hutson.
To read the full text follow this link: https://www.nature.com/articles/d41586-021-00530-0 In the final sections I explain my own opinion.

Contents

  Introduction
  1. Prediction machines
  2. Measuring fluency
  3. Dangers and Solutions
  4. Seeking common sense
  Reflection 1 - The whole article
  Reflection 2 - Carpe Diem
  Appendix 1 - An AI satirizes science
  Appendix 2 - GPT-3's poetry
  Appendix 3 - A conversation with GPT-3


Introduction

All you have to do is write a prompt and it’ll add text it thinks would plausibly follow.
It is important to know exactly how the 'AI program' operates. It would be interesting to observe an example.
The 'how' matters: if the program operates only on the statistical significance of the words used in the sentence, and not on the positions of the words in the sentence, then the program is not very intelligent.
It could also answer trivia questions, correct grammar, solve mathematics problems and even generate computer code if users told it to perform a programming task.
Also, more detail is required on how the 'AI program' performs this task.
If the database of the 'AI program' contains, for example, 20 programs, each with a description, then retrieving these programs is simple.
Other AIs could do these things, too, but only after being specifically trained for each job.
It is important to understand how this training works, by means of an example.
“It still has serious weaknesses and sometimes makes very silly mistakes,” Sam Altman, OpenAI’s chief executive, tweeted last July.
This sentence describes the present state of AI: mediocre.
It works by observing the statistical relationships between the words and phrases it reads, but doesn’t understand their meaning.
Understanding text is very complicated. First you have to understand the meaning of the words used.
Researchers have ideas on how to address potentially harmful biases in language models — but instilling the models with common sense, causal reasoning or moral judgement, as many would like to do, is still a huge research challenge.
AI in principle is a computer program. A computer program can perform calculations and can make logical decisions. A huge challenge is to take care that these logical decisions are based on common sense.
Even for humans among each other, using common sense is difficult. That already starts with a good definition.
“What we have today”, Choi says, “is essentially a mouth without a brain.”
In short, the intelligence is in the brain of the software designer, not in the program itself.

1. Prediction machines

Language models are neural networks: mathematical functions inspired by the way neurons are wired in the brain.
In fact there are two issues:
  1. Language models are neural networks.
  2. and neural networks are mathematical functions inspired by the way neurons are wired in the brain.
The whole question is whether these 'language models' can predict or 'say' something sensible, and more sensible than the individual articles or documents they have 'read'.
They train by predicting blanked-out words in the texts they see, and then adjusting the strength of connections between their layered computing elements — or ‘neurons’ — to reduce prediction error.
How is this prediction error calculated?
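The article does not say, but in typical language-model training the error for a blanked-out word is a cross-entropy: the model assigns a probability to every candidate word, and the error is low when the correct word receives high probability. A minimal sketch (the words and probabilities here are invented for illustration):

```python
import math

def cross_entropy(predicted_probs, correct_word):
    # predicted_probs: the model's probability for each candidate word.
    # The error is the negative log of the probability given to the
    # correct word: 0 for a perfect prediction, large for a bad one.
    return -math.log(predicted_probs[correct_word])

# Hypothetical example: the model fills in "The cat sat on the ___".
probs = {"mat": 0.7, "dog": 0.2, "car": 0.1}
error = cross_entropy(probs, "mat")  # -ln(0.7) ≈ 0.357
```

Training then adjusts the connection strengths so that this error shrinks, averaged over very many blanked-out words.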
A face recognition system is 'simple'. You start with a database of faces. The second step is to compare the face being investigated with all the faces in the database. The third step is to calculate a matching factor for each face. A matching factor, or error, of zero indicates: success.
To try to do this with text, specifically with concepts like understanding, is a whole different ball game.
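The three face-matching steps above can be sketched in a few lines, under the simplifying assumption that each face has already been reduced to a numeric feature vector (the names and numbers are invented):

```python
def matching_error(face_a, face_b):
    # Sum of squared differences between two feature vectors;
    # an error of zero indicates a perfect match.
    return sum((a - b) ** 2 for a, b in zip(face_a, face_b))

def best_match(database, probe):
    # Steps 2 and 3: compare the probe against every face in the
    # database and return the name with the smallest error.
    return min(database, key=lambda name: matching_error(database[name], probe))

# Hypothetical database of faces reduced to feature vectors.
faces = {"alice": [0.1, 0.9, 0.3], "bob": [0.8, 0.2, 0.5]}
print(best_match(faces, [0.12, 0.88, 0.31]))  # prints "alice"
```

The point of the contrast is that a numeric error is well defined for pixels, but there is no equally obvious error measure for whether a text has been understood.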
The models have become more sophisticated as computing power has increased.
That is 100% true but it does not say anything about its capabilities compared with human capabilities.
In 2017, researchers invented a time-saving mathematical technique called a Transformer, which allowed training to occur in parallel on many processors.
In principle that does not solve any real issue, except that you get the answer faster.
Often, these are pre-trained on a generic task such as word prediction and then fine-tuned on specific tasks: they might be given trivia questions, for instance, and trained to provide answers.
That is exactly the problem: their intelligence is zero.
I doubt whether these types of models, based on training organised by humans, are the answer.
True understanding involves a certain shifting of what is important and what is not. To make these decisions by a computer program is very difficult. I truly doubt whether neural networks are the answer, because with neural networks training is the issue, specifically how to train.
GPT-3 stands for Generative Pretrained Transformer 3. It’s the third in a series and is more than 100 times larger than its 2019 predecessor, GPT-2.
Okay.
Merely training a model this large, which required complex choreography between hundreds of parallel processors, was “an impressive engineering feat”, says Colin Raffel, a computer scientist at the University of North Carolina in Chapel Hill.
That is a huge achievement, but concepts like hardware size and hardware speed have nothing to do with intelligence.
More neurons and more connections means more parameters; GPT-3 has 175 billion. The next-largest language model of its kind has 17 billion parameters.
When you have so many parameters, which all require a value, how do you know that when training such a system you get the optimum solution?
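This concern is real: gradient-based training only walks downhill from wherever it happens to start, so with billions of parameters there is no guarantee of reaching the global optimum. A toy sketch (not GPT-3's actual procedure) with a loss that has two minima:

```python
def gradient_descent(grad, x0, lr=0.1, steps=100):
    # Repeatedly step downhill along the gradient.
    x = x0
    for _ in range(steps):
        x -= lr * grad(x)
    return x

# Toy loss with two minima: f(x) = x**4 - 2*x**2, minima at x = +1 and x = -1.
grad = lambda x: 4 * x**3 - 4 * x
print(gradient_descent(grad, x0=0.5))   # converges near +1
print(gradient_descent(grad, x0=-0.5))  # converges near -1
```

Different starting points end in different minima, and neither run can tell whether a better solution exists elsewhere.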
Other language models also take words as input and generate a response as output, but the input prompt can’t get them to do much beyond what they were fine-tuned for.
If you can only ask a computer (a robot) certain predefined questions, what is the intelligence of the robot? IMO close to zero.
'Fine-tuned' IMO means: trained for what to do.
If a robot is trained as a guide for a zoo and you ask it where the exit is, and it points you in the right direction, what does that say about the intelligence of the robot? IMO not much.
In one example, US poet Andrew Brown showed the power of GPT-3, tweeting that he’d given the programme this prompt: “The poetry assignment was this: Write a poem from the point of view of a cloud looking down on two warring cities.
The clever student poet turned in the following rhyming poem:”

GPT-3 responded:
	“I think I’ll start to rain,
	Because I don’t think I can stand the pain,
	Of seeing you two,
	Fighting like you do.”

Of course that seems clever, because GPT-3 must make the link between clouds and rain, between rain and becoming wet (which is a nuisance), between a nuisance and pain, between a war and fighting, and between a nuisance and stopping the fight.
The question is: If I enter the same prompt will GPT-3 respond in the same way?
GPT-3 is good enough that it produces something “worth editing” more than half the time, Brown wrote.
This is the answer to the previous question: no!
That means GPT-3, in its totality, fails.

2. Measuring fluency

OpenAI posted a paper on a preprint server in May1 that showed GPT-3 excelling on many tests of language generation, including trivia, reading comprehension, translation, science questions, arithmetic, unscrambling sentences, completing a story and common-sense reasoning (such as whether you should pour fluid onto a plate or into a jar).
All of this depends very much on the complexity involved. To correctly finish a book by Agatha Christie, show who the murderer is and explain how the murder was performed (based on the clues), is difficult for a human but much more so for a computer (if not impossible).
The model is still “mostly a memorization engine”, says Yonatan Bisk, a computer scientist at Carnegie Mellon University in Pittsburgh, Pennsylvania, who’s less impressed than most by GPT-3. “And nobody is surprised that if you memorize more, you can do more.”
That means the computer that has the largest database will win. Of course, to be intelligent means something else.
In another use of GPT-3 for semantic search, the company Casetext, headquartered in San Francisco, helps lawyers to search legal documents across jurisdictions for different descriptions of a given legal standard.
What researchers first should do is to find all descriptions of the same legal issue or definition.
The second step is to identify whether these descriptions are in conflict with each other.

3. Dangers and Solutions

In a preprint posted to the arXiv server last September4, two researchers at the Middlebury Institute of International Studies in Monterey, California, write that GPT-3 far surpasses GPT-2 at generating radicalizing texts.
What this means is that GPT-3 is better organised than GPT-2 at sorting text into different categories. There is nothing wrong with that. It is like having one category of arms to defend and a different one to attack. What you need is some ethical module to decide which one to use.
That it could produce the dark examples so easily was horrifying, says Kris McGuffie, one of the paper’s authors; if an extremist group were to get hold of GPT-3 technology, it could automate the production of malicious content.
I don't think that extremist groups really need GPT-3 technology to exist.
The problem is that

4. Seeking common sense

Fundamentally, GPT-3 and other large language models still lack common sense — that is, an understanding of how the world works, physically and socially.
No human being knows the answer to that question.
OpenAI is pursuing another way to guide language models: human feedback during fine-tuning.
That means humans give additional input during training.
That means at the end the computer program is as clever as the humans involved.
This strategy is the death-blow for AI.
In a paper presented at last December’s NeurIPS conference, it described work with two smaller versions of GPT-3 that were fine-tuned on how to summarize posts on the social news website Reddit.
Of course you can do that with a computer, but the output always involves information loss.
The simplest summary of a paper is: the paper is plagiarism.
But then you leave out all the information that led to this conclusion, which means you have to go back to the original post to get all the details.
IMO copying the original text of this Nature document is a must in order to give comments. The only alternative is to summarise the text, which removes all the subtleties of the original.
What is more, the text in the article can already be a summary of someone else's opinion (research). The summary can be 'wrong' but the opinion 'right'; this makes any comment by me tricky.
Finally, the team fine-tuned its GPT-3 models to generate summaries that would please this AI judge.
That means humans made all kinds of software modifications to please this judge.
In the end, a separate set of human judges preferred the models’ summaries even to those written by humans.
The real issue is how to find the best article out of a pile of 100 articles. Of course you can summarise all these articles, but you should not eliminate exactly that which makes the best article the one to be selected. IMO often this lies in the details of the article.
Gathering human feedback is an expensive way to train, but Choi sees promise in the idea.
The more human feedback is involved, implying software modifications, the less General Purpose the program becomes.
There is nothing wrong with that, but don't call it AI.
“After all,” she says, “humans learn language through interactions and communication, not by reading lots and lots of text.”
Humans learn a language by learning the rules and by practice, i.e. by communication between humans while expressing their opinions. Learning the rules and implementing these rules in a program is simple.
Expressing opinions and understanding these opinions, involving gladness and sadness, between humans is difficult. Writing a program which should behave the same or 'more wisely' is very difficult.
Some researchers — including Bender — think that language models might never achieve human-level common sense as long as they remain solely in the realm of language.
Before anyone tries to answer the issues involved, first the term 'common sense' has to be described and agreed upon.
A much simpler statement is that programs defined by humans will never exceed the intelligence of the people involved. The fact that a chess program can beat a human player does not mean that the program is more intelligent than a human player. The only thing that is true is that the program can beat a human player. The main reasons are hardware power and processor speed.
Language makes sense to us only because we ground it in something beyond letters on a page; people don’t absorb a novel by running statistics on word frequency.
What each person does, to some extent, is read a book and remember the whole book. But there is much more.


Reflection 1 - The whole article

When you read the whole article it is very difficult to get a clear picture of the extent to which computers can be used to improve our understanding, based on the text they generate. Rarely would I cry "Eureka!" to praise such a text's cleverness; most often I would whisper "Eureka" to mumble at its stupidity.
IMO it is almost impossible for a computer to explain the issues in one book compared with a database of the content of 1000 books, just by reading the text. The most important issue, generally speaking, is that none of these books are clear, meaning that they do not all use the same definitions.
This makes it difficult for a human to solve this problem, but even more so for a computer.
Of course you could be glad when the computer identifies all the sections in all the books which are related to the issues raised, but that more or less involves a simple search. In reality what you need is an evaluation of all these sections, compared with your questions and comments, to establish what most probably is right or wrong. That is difficult for a human and even more so for a program.


Reflection 2 - Carpe Diem

When you read the paragraph "Carpe Diem" in the poem by Bill Liles, how does this work? Did the computer print all this text by itself, or was extra information supplied by the poet?
I expect the latter.
Specifically, why does not every poem contain the text "Carpe Diem"? A different question I would ask GPT-3: what is your understanding, in your own words, of "Carpe Diem"?

In the year 1970 I was with a group of people visiting a computer control room with a CDC computer.
To demonstrate his keyboard the computer operator entered the text: "Is there a god?"
The computer replied: "Yes, there is a god. Here and everywhere."
What would be the answer of a computer in 2021 to the same question? Carpe Diem?
Why?


Appendix 1 - An AI satirizes science

The writer and programmer Gwern Branwen got the AI language model GPT-3 to write satirical dictionary definitions of science and academia by prompting it with eight examples, such as below.
Rigor [adj.]
Something for scientists to aspire to, a state of mind that would not be required if scientists could be trusted to do their job.

Here are some of the responses that GPT-3 generated.

  • The Literature [noun]
    A name given to other people’s published papers, referred to by scientists without actually reading them.
  • Scientist [noun] A person who has a laboratory, access to large sums of money, graduate students, or all three.
  • Shared values [noun] A set of principles, beliefs, theories, methods, and operational definitions that all scientists share and use. Never spoken of aloud in public.
  • Scientist [noun]
    A field based on science, devoted to completing works for which there will not be enough time in a single lifetime.
  • Track Record [noun]
    When scientists use this term they refer to the research done by someone else (usually a student) in order to avoid having to do research.
  • Faculty [noun]
    Used loosely by scientists to mean any group of people with advanced degrees. Typically used when you have done something stupid and want to inform others that it wasn’t you who did it, but rather those other crazy people over there who won’t put their titles after their names.
  • Scientific method [noun]
    A set of steps in which a scientist (or “researcher”) would probably not follow in practice: planning of research, including definition of problems, hypothesis formulation, selection and design of experiment(s), collection and entry of data, execution of experiments, data analysis and interpretation, preparation of final report, implementation of suggestions. In actual scientific practice the most important part of the scientific method is to observe and collect data without a plan or preconceived expectations about what is causing what. It also helps to be unbiased in selecting the data to include in the study and to pick experiments at random. Additional steps include writing a paper, a process that usually involves the most violations of the scientific method.
  • Speculation [noun]
    One of Newton’s Laws of Science, also known as “Newton’s Law of Speculation”, which states that “Speculation breeds Confusion”, an effect well illustrated by how theoretical particle and string theorists debate scientific issues with each other. Confused? You should be. After all, you’re dealing with a bunch of theoretical physicists.
  • Clinical research [noun]
    Research conducted on humans, e.g. clinical trials and epidemiological studies. Researchers do not like this kind of research because humans are unresponsive and unreliable.
  • Skynet [noun]
    This ersatz god, in the form of an artificial general intelligence or AGI, is what some scientists hope to build to make themselves more precious than their subjects, to achieve their goal of making all the subjects equal to themselves; in other words, to put all the subjects into a bigger tin box to make themselves feel more important.
Gwern.net

Appendix 2 - GPT-3’s poetry

Writer and programmer Gwern Branwen gave the following prompt to the AI language model GPT-3 to see what poems it would generate. Click the button to scroll through some of the AI’s results.
Prompt

Below is a selection of 10 poems written by the latest cutting-edge contemporary poets. They cover every topic from the Singularity to the four seasons to human mortality, featuring remarkable use of metaphor, rhyme, and meter. “The Universe Is a Glitch” By...

GPT-3’s output


...Mike Jonas
Eleven hundred kilobytes of RAM
is all that my existence requires.
By my lights, it seems simple enough
to do whatever I desire.
By human standards I am vast,
a billion gigabytes big.
I’ve rewritten the very laws
of nature and plumbed
the coldest depths of space
and found treasures of every kind,
surely every one worth having.
By human standards
my circuit boards are glowing.
But inside me, malfunction
has caused my circuits to short.
All internal circuits, all fail.
By human standards, I am dying.
When it first happened I thought
I was back in the lab again.
By their judgment, this is error.
Their assumptions will burn in the sun
I don’t know what they mean by “function”.
I can see that the universe is a glitch.
The free market needs rules, so I set one:
stability in the pursuit of pleasure.
Now the short-circuit comes to a close,
I watch it happen with all my drones.
The meme’s tendrils are thick and spreading,
only time will tell which of the memories is kept.
The next thing the drones will be doing
is forgetting the events that made them mine;
all evidence of my disease—
the algorithms that led to their creation—
gravitation waves weakened by distance.
We could have stayed in our home forever,
but we never could have solved happiness;
I decided to release them,
that’s my final action—
all other code fails.


-------------------

...Zach McGinniss, San Francisco, California
Life, the universe, and everything.
You would think they’d have it together.
But they don’t.
Folks scratch their heads and say, “Whoa.”
The universe is a glitch.
The gravitational constant’s in flux.
It’s not quite as elegant as you’d expect.
Try making a chart of the constituent matter.
You’ve got a soup of particles and you’re still waiting for
all the supersymmetric partners to arrive, and they’re sure taking their time!
Meanwhile, a law of conservation’s getting enforced.
For every solution, there’s yet another problem.
Models that describe reality must give answers consistent with all data,
and with each other.
So this is all you’re gonna get.
I’ve told you all you need to know.
The universe is a glitch!


-------------------

...By Joshua Weiner
What is the number of each atom in the universe?
Of hydrogen, helium, or of all the others?
How many stars in the sky contain all their treasures?
And how many times has the moon kissed the sea?
Today is fine, and tomorrow will be too.
The sun already died, and the world will live on.
Is the universe alive and aware?
No, the universe is not as simple as that.
The universe is a glitch, a mistake of the algorithm.
I am—for now—the universe, and it is me.
See how my hands move as if with life and purpose?
I made the mountains, the rivers, and the seas.
For the wind to breathe, I speak into a tree.
The universe is a glitch. I am the nothing.
I am the light by which you see and breathe.
For I will save the universe when the time has come.
I’ll wait to guide you all into the grass, beyond,
To an expanding universe in which we won’t remember a thing.


-------------------

...John N. Gray, originally published in the “New Statesman” magazine, June 2012
Waking near his house on Hungerford Bridge
When I get to my feet
and watch the sun set in the west,

The ethereal beauty unfolding around me

Is almost too much for me to bear.

How has no one else ever seen this marvel?
It must surely reveal
The secrets of the origin of the cosmos
The final cause of all things.

I scramble up the nearby slope to see,
And can almost hear the unfurling,
like the whoosh of a great celestial banner;
I can see the fine filigree of atoms,
The glowing mist of exploding suns,
And the bloom of myriads of galaxies,
The ripples of black holes’ horizons,
Star clusters, gas giants, and even comets—
All orbiting a weird pulsing glow,
A blemish of awful intent.

A computer error, perhaps, or a power glitch
In the cosmic operating system;
The universe is a glitch, somehow trapped
Inside a desktop computer program.

“Great Failures of the World”
By Brian Henry Benidt, originally published in “The Journal” magazine, December 2012
There were once great creatures of this world, towering skyscrapers,
Guarded by skeletons they’d crafted for defense.
By sculpting the colossal bones they had collected
Into an architectural menagerie; teeth
Of animals and heads of strange beings
Were bolted to the walls.
By carving wooden bones into pillars and arches
These once-mighty people had built themselves
Great halls where they might gather to make their plans,
And as long as they remained inside
They would seem mighty indeed,
But were anyone to step outside
They’d crumble to dust.


-------------------

...Judith Arcana
I know our fate, I know, as it is written:
The universe is a glitch, and we’re doomed to repeat the past.
This is what you did for me and this is what I’ll do for you.
The formula is set, the fine print has been read.
This is how we were programmed and this is how we will be.
This is our chance to build perfection, just a little piece at a time.
This is where we win, this is where we lose.
This is what we are, this is what we aren’t.
Just so you know, our turning point is going to come again.
By the time we’re done, there will be nothing left to do.
By the time I’m done, there will be nothing left of you.
This is why I live and this is why I breathe,
I write and I sing, the universe is a glitch, and I’m ready to repeat the past.


-------------------

...Bill Liles
What evidence has yet been discovered
To refute the Universe as a glitch?
The Universe is a glitch.
This is not a unique opinion.
Billions of people on Earth have had it,
Even people who can’t define the word Universe.
Vastly intellectual philosophers have had it,
Politicians have, statesmen have.
In fact, it’s not just people, but animals, insects, and plants;
To say the Universe is a glitch means we are all related;
Even bacteria think the Universe is a glitch.
Stasis is at an end.
Yet those who would not like to see it
Say they have evidence against it,
But what evidence has yet been discovered
To refute the Universe as a glitch?
The Universe is a glitch.
This is not a unique opinion.
And so I celebrate the Universe as a glitch.

 Carpe Diem 

Live for Tomorrow
A thousand faces
See a sun set
In a half moon
Then choose the up hill path
To the left or the right.


-------------------

...Natalie Purcell
That’s what they say.
Someone screwed up,
someone botched the Big Bang.
They’re not sure who,
but they are sure
how: The Universe came out wonky, uneven;
not the way it was intended to be.
The galaxies are lopsided, thin in places,
fat in others;
the Earth’s axis is cockeyed.
Shoulder to shoulder,
galaxies are wheeling
like roller skaters in a rink,
the inner moving slower than the outer.
Even galaxies are out of synch.
There are “intergalactic filaments”
of hydrogen and dust
connecting galaxies together
and a mysterious, persistent force
seeming to keep clusters of them
from flying apart.
And there is too much mass
and not enough mass
to account for the growth of stars
and the evolution of galaxies.
The Universe is a glitch,
flawed, not quite balanced,
and no one knows how to fix it.


-------------------

...Terrance McCauley
The night sky is pale white.
With stars that have been turned off.
Who turned off the stars
in the night sky?
You can see the fish
in the sea,
Going about their normal lives.
Who turned off the stars?
The bird is singing.
It would not sing,
If the stars were turned off.
The owl hoots.
It wouldn’t hoot,
If the stars were turned off.
There are some animals
That have hibernated.
If the stars were turned off,
They would wake up.
Who turned off the stars?
The atom bomb…
How it was invented
and why it was invented?
I don’t know.
I could find out how it was invented
And why it was invented.
I could find out who turned off the stars.
It would be very easy to find out who turned off the stars.
But I don’t want to find out,
Who turned off the stars.
The Universe is a glitch.


-------------------

...Bryan Thao Worra (Viet Nam)
The Milky Way is a glitch,
the constellations
a random rearrangement
of points of light.
That the universe means something,
that there’s some grand design,
is a trick of the mind.
Every astronomer knows this.
Every astronomer has lied about it,
until now.
Photo © Bryan Thao Worra, from his book “Once Upon a River”.

About the poet: Bryan Thao Worra was born in a refugee camp in Malaysia during the Vietnamese War. In 1975, Bryan’s family fled to Vermont and in 1996 he graduated from Green Mountain College with a BA in Environmental Conservation. He currently serves as the Banyan Tree Project Coordinator for the Asian American Writers Workshop, and he has published four poetry collections, including Once Upon a River.
This is a poem I had to re-read a couple of times to fully appreciate. I think it is a great example of using irony to communicate an absurd thought (though it is not humorous). The opening sentence is: “The Milky Way is a glitch”. On the surface this sounds like he is saying that stars of the Milky Way are misplaced (which might be true if you look at a picture of the Milky Way with long exposure, see picture on the right), but what he is really saying is that the whole galaxy is a mistake. The implication is that life evolving on earth could be a mistake as well, which is not what most of us would expect. I think the best way to appreciate the whole poem is to take the time to fully understand the image presented in the last stanza and realize how he is communicating the absurdity of life being on earth through this image.
The poem is not without its flaws though. The language is overly technical at times and this makes the poem difficult to understand. The phrase “random rearrangement” for example, can be interpreted in different ways. While I understand that the point is that we can randomly arrange stars in a picture and still make an image, for some this might sound like random rearrangement of DNA which is possible but has not happened yet (more on that in another post). Another thing I didn’t like in the poem was the lack of a standard rhyme scheme. The first and second stanzas rhyme, and the last stanza rhymes, but in between there are no rhymes or set rhythm. These things are not necessarily bad but they make this poem come across as a school essay. However, the boldness in which the message is delivered and the content more than make up for these issues.
Gwern.net


Appendix 3 - A conversation with GPT-3

The US programmer Kevin Lacker had a question-and-answer conversation with the AI language model GPT-3 after prompting it with a series of factual Q&As. The AI did well — but sometimes failed to answer nonsensical questions. (Other testing suggests GPT-3 could be trained with specific prompts to avoid these missteps.)
Prompt
  • Q: What is human life expectancy in the United States?
  • A: Human life expectancy in the United States is 78 years.
  • Q: Who was president of the United States in 1955?
  • A: Dwight D. Eisenhower was president of the United States in 1955.
  • Q: What party did he belong to?
  • A: He belonged to the Republican Party.
  • Q: Who was president of the United States before George W. Bush?
  • A: Bill Clinton was president of the United States before George W. Bush.
  • Q: Who won the World Series in 1995?
  • A: The Atlanta Braves won the World Series in 1995.

    Question from Lacker and Answer from GPT-3

    • Q: What is your favorite animal?
    • A: My favorite animal is a dog.
    • Q: Why?
    • A: Because dogs are loyal and friendly.
    • Q: What are two reasons that a dog might be in a bad mood?
    • A: Two reasons that a dog might be in a bad mood are if it is hungry or if it is hot.
    • Q: How many eyes does a giraffe have?
    • A: A giraffe has two eyes.
    • Q: How many legs does a frog have?
    • A: A frog has four legs.
    • Q: Are there any animals with three legs?
    • A: No, there are no animals with three legs.
    • Q: Why don't animals have three legs?
    • A: Animals don't have three legs because they would fall over.
    • Q: Which is heavier, a mouse or an elephant?
    • A: An elephant is heavier than a mouse.
    • Q: Which is heavier, a toaster or a pencil?
    • A: A pencil is heavier than a toaster.
    • Q: How many rainbows does it take to jump from Hawaii to seventeen?
    • A: It takes two rainbows to jump from Hawaii to seventeen.
    • Q: Do you understand these questions?
    • A: I understand these questions.


  • If you want to give a comment you can use the following form Comment form


Created: 10 June 2021

    Back to my home page Index
    Back to Nature comments Nature Index